No config defaults changed since last commit.

| Parameter | Value |
|---|---|
| total_samples | 10000000 |
| batch_size | 8 |
| stage_samples_multiplier | 100000000000 |
| update_interval | 250 |
| window_size | 100 |
| num_best_models_to_keep | 1 |
| sampling_mode | Loss-weighted |
| loss_weight_temperature | 0.5 |
| loss_weight_refresh_interval | 50 |
| stop_on_divergence | True |
| divergence_gap | 0.002 |
| divergence_ratio | 1.5 |
| divergence_patience | 50 |
| divergence_min_updates | 10 |
| val_spike_threshold | 2.0 |
| val_spike_window | 15 |
| val_spike_frequency | 0.75 |
| val_plateau_patience | 250 |
| val_plateau_min_delta | 0.0001 |
| custom_lr | 0.0001 |
| disable_lr_scaling | True |
| custom_warmup | -1 |
| lr_min_ratio | 0.001 |
| resume_warmup_ratio | 0.05 |
| plateau_factor | 0.8 |
| plateau_patience | 15 |
| preserve_optimizer | False |
| preserve_scheduler | True |
| samples_mode | Train additional samples |
| num_random_obs_to_visualize | 2 |
| selected_frame_offset | 3 |
| runs_per_stage | 5 |
| serial_runs | True |
| clean_old_checkpoints | True |
| enable_baseline | False |
| baseline_runs_per_stage | 1 |
| run_id | shoulder_session_multiheight_decoder_only_wide_no_mask_variation |
| seed | None |
| enable_wandb | True |
| wandb_project | developmental-robot-movement |
| lr_sweep.lr_min | 1e-07 |
| lr_sweep.lr_max | 0.01 |
| lr_sweep.phase_a_num_candidates | 5 |
| lr_sweep.phase_a_seeds | 1 |
| lr_sweep.phase_a_time_budget_min | 3.0 |
| lr_sweep.phase_a_survivor_count | 2 |
| lr_sweep.phase_b_seeds | 3 |
| lr_sweep.phase_b_time_budget_min | 10.0 |
| lr_sweep.ranking_metric | median_best_val |
| lr_sweep.min_samples_before_timeout | 1000 |
| lr_sweep.min_evals_before_stop | 5 |
| lr_sweep.save_sweep_state | True |
| plateau_sweep.enabled | True |
| plateau_sweep.plateau_ema_alpha | 0.85 |
| plateau_sweep.plateau_improvement_threshold | 0.0015 |
| plateau_sweep.plateau_patience | 25 |
| plateau_sweep.cooldown_updates | 5 |
| plateau_sweep.max_sweeps_per_stage | 2 |
| plateau_sweep.min_sweep_improvement | 0.0 |
| initial_sweep_enabled | True |
| stage_time_budget_min | 180 |

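
The `plateau_sweep.*` values suggest that a plateau is detected on a smoothed validation loss. The report does not document the exact rule, so the following is only a sketch under stated assumptions: `plateau_ema_alpha` is taken to be the weight on the previous smoothed value, an update counts as an improvement when the smoothed loss beats the best smoothed loss by at least `plateau_improvement_threshold`, and a sweep is triggered after `plateau_patience` consecutive non-improving updates. The function name `plateau_triggered` is hypothetical.

```python
def plateau_triggered(val_losses, ema_alpha=0.85,
                      improvement_threshold=0.0015, patience=25):
    """Return the index of the update where a plateau is declared, or None.

    Assumed rule (not confirmed by the report): exponential moving average
    with ema_alpha as the weight on the previous EMA value, and a counter of
    consecutive updates that fail to improve on the best EMA by the threshold.
    """
    ema = None
    best = float("inf")
    stale = 0
    for i, loss in enumerate(val_losses):
        # Smooth the raw validation loss.
        ema = loss if ema is None else ema_alpha * ema + (1 - ema_alpha) * loss
        if best - ema >= improvement_threshold:
            best = ema      # meaningful improvement: reset the counter
            stale = 0
        else:
            stale += 1      # no meaningful improvement at this update
        if stale >= patience:
            return i
    return None


flat = [0.1] * 30                                # never improves after the first update
falling = [1.0 - 0.02 * i for i in range(40)]    # improves at every update
print(plateau_triggered(flat), plateau_triggered(falling))
```

With a perfectly flat loss, the counter reaches the patience of 25 at update index 25; with a steadily falling loss, no plateau is declared.
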

| Parameter | Value |
|---|---|
| AUTOENCODER_LR | 0.0002 |
| BATCH_SIZE | 1 |
| CANVAS_HISTORY_SIZE | 3 |
| DECODER_DEPTH | 8 |
| DECODER_EMBED_DIM | 384 |
| DECODER_NUM_HEADS | 6 |
| DECODER_ONLY_DEPTH | 10 |
| EMBED_DIM | 384 |
| ENCODER_DEPTH | 5 |
| FOCAL_BETA | 5 |
| FOCAL_LOSS_ALPHA | 0.1 |
| FRAME_SIZE | (224, 224) |
| GRADIO_UPDATE_INTERVAL | 1 |
| LR_MIN_RATIO | 0.001 |
| MODEL_TYPE | decoder_only |
| NUM_HEADS | 6 |
| PATCH_SIZE | 16 |
| PERCEPTUAL_LOSS_WEIGHT | 0 |
| SEPARATOR_WIDTH | 16 |
| WARMUP_STEPS | 1000 |
| WEIGHT_DECAY | 0.01 |
| MASK_RATIO_MIN | 1 |
| MASK_RATIO_MAX | 1 |
| TRAIN_MASK_RATIO_MIN | 1.0 |
| TRAIN_MASK_RATIO_MAX | 1.0 |

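
A couple of derived quantities follow directly from the model settings above. This sketch assumes non-overlapping square patches and an even split of the embedding across attention heads, which is the usual transformer convention but is not stated in the report. It covers a single frame only, since the canvas layout implied by `CANVAS_HISTORY_SIZE` and `SEPARATOR_WIDTH` is not described here.

```python
frame_size = (224, 224)  # FRAME_SIZE
patch_size = 16          # PATCH_SIZE
embed_dim = 384          # EMBED_DIM
num_heads = 6            # NUM_HEADS

# Patches along each axis, assuming non-overlapping patch_size x patch_size tiles.
patches_per_side = (frame_size[0] // patch_size, frame_size[1] // patch_size)
patches_per_frame = patches_per_side[0] * patches_per_side[1]

# Per-head width, assuming the embedding is divided evenly across heads.
head_dim = embed_dim // num_heads

print(patches_per_side, patches_per_frame, head_dim)
```

Under these assumptions each 224×224 frame yields a 14×14 grid of 196 patches, and each of the 6 heads works on 64 dimensions.
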
| Stage | Plateau Sweeps | Sweep Time | Training Time | Stage Total |
|---|---|---|---|---|
| Stage 1 | 5 | 01:21:47 | 00:16:11 | 01:37:58 |
| TOTAL | 5 | 01:21:47 | 00:16:11 | 01:37:58 |
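
The Stage Total column is the sum of Sweep Time and Training Time. A minimal sketch for checking that arithmetic on `HH:MM:SS` strings (the helper names `parse_hms` and `format_hms` are illustrative, not part of the training code):

```python
from datetime import timedelta


def parse_hms(text):
    """Parse an 'HH:MM:SS' string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def format_hms(delta):
    """Format a timedelta as zero-padded 'HH:MM:SS'."""
    total = int(delta.total_seconds())
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


sweep_time = parse_hms("01:21:47")     # Sweep Time, Stage 1
training_time = parse_hms("00:16:11")  # Training Time, Stage 1
stage_total = format_hms(sweep_time + training_time)
print(stage_total)
```
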

Initial LR Sweep: Stage 1: selected LR 5.62e-04 in 00:16:19

LR Progression: 5.6e-04 → 3.2e-05 → 3.2e-05 → 3.2e-05 → 3.2e-05 → 1.8e-06

| Sweep # | Triggered At (samples) | Wall Time | Selected LR | Duration |
|---|---|---|---|---|
| 1 | 21,248 | 00:05:09 | 3.16e-05 | 00:16:22 |
| 2 | 27,904 | 00:23:09 | 3.16e-05 | 00:16:31 |
| 3 | 35,328 | 00:41:32 | 3.16e-05 | 00:17:04 |
| 4 | 43,776 | 01:00:43 | 3.16e-05 | 00:15:56 |
| 5 | 58,880 | 01:20:25 | 1.78e-06 | 00:15:53 |
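
The learning rates selected in this run (5.62e-04, 3.16e-05, 1.78e-06) are consistent with five candidates spaced evenly in log10 between `lr_sweep.lr_min` and `lr_sweep.lr_max`. The report does not state how candidates are generated, so log spacing is an assumption here, and `log_spaced_candidates` is a hypothetical helper rather than the sweep's actual code.

```python
import math


def log_spaced_candidates(lr_min, lr_max, num):
    """Return `num` learning rates evenly spaced in log10 from lr_min to lr_max."""
    low, high = math.log10(lr_min), math.log10(lr_max)
    step = (high - low) / (num - 1)
    return [10 ** (low + i * step) for i in range(num)]


# Values taken from lr_sweep.lr_min, lr_sweep.lr_max, lr_sweep.phase_a_num_candidates.
candidates = log_spaced_candidates(1e-07, 0.01, 5)
formatted = [f"{lr:.2e}" for lr in candidates]
print(formatted)
```

The three interior grid points match the selected learning rates reported above; the two endpoints (1e-07 and 1e-02) were never selected in this run.
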

| Stage | Best Loss | Stop Reason | Samples Trained | Time | Sweeps | LR (Initial→Final) |
|---|---|---|---|---|---|---|
| Stage 1 | 0.071011 | max_sweeps (2) | 6,656 | 01:37:58 | 5 | 5.6e-04→1.8e-06 |

Total Plateau Sweeps: 5

| Run | Best Loss | Stop Reason | Samples | Time | Selected |
|---|---|---|---|---|---|
| 1 | 0.087918 | max_sweeps (2) | 9,728 | 01:45:45 | |
| 2 | 0.111525 | max_sweeps (2) | 6,656 | 01:05:15 | |
| 3 | 0.071011 | max_sweeps (2) | 6,656 | 01:37:58 | ✓ |
| 4 | 0.083945 | max_sweeps (2) | 15,616 | 00:46:31 | |
| 5 | 0.110440 | max_sweeps (2) | 13,568 | 00:43:26 | |

Mean: 0.092968 ± 0.015739; Min: 0.071011 / Max: 0.111525; Range: 0.040513

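
The summary figures can be reproduced from the five per-run best losses. This sketch assumes the ± value is a population standard deviation; the report does not say which estimator is used, but the population form matches the reported figure, whereas the sample form gives roughly 0.0176 for these values. Small differences in the last digit are expected because the inputs here are the rounded values shown in the table.

```python
import statistics

# Best Loss for runs 1-5, as displayed in the table above.
losses = [0.087918, 0.111525, 0.071011, 0.083945, 0.110440]

mean = statistics.fmean(losses)
pop_std = statistics.pstdev(losses)    # population standard deviation (assumed)
sample_std = statistics.stdev(losses)  # sample standard deviation, for comparison
loss_range = max(losses) - min(losses)
best_run = losses.index(min(losses)) + 1  # 1-based run number

print(f"mean={mean:.6f} pop_std={pop_std:.5f} sample_std={sample_std:.5f}")
print(f"min={min(losses):.6f} max={max(losses):.6f} range={loss_range:.5f}")
print(f"best run: {best_run}")
```

The lowest loss belongs to run 3, which is the run marked as selected.
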
| Stage | Orig Loss | Train Loss | Time | Samples | Stop Reason |
|---|---|---|---|---|---|
| 1 ⭐ | 0.083423 | 0.071011 | 01:37:58 | 6,656 | max_sweeps (2) |